Abstract

We study here the impact of priorities on conflict resolution in inconsistent
relational databases. We extend the framework of [1], which is based on the
notions of repair and consistent query answer. We propose a set of postulates
that an extended framework should satisfy and consider two instantiations of
the framework: (locally preferred) $\ell$-repairs and (globally preferred)
$g$-repairs. We study the relationships between them and the impact each notion
of repair has on the computational complexity of repair checking and
consistent query answers.



1 Introduction 

The main purpose of integrity constraints is to express semantic properties of 
the data stored in the database. Usually, it is the database management system 
that is responsible for maintaining the integrity of the database. However, in 
many recent applications integrity enforcement becomes problematic. For
example, in the data integration setting, even when the data contained
by a data source satisfies the integrity constraints, a different data source may
contribute conflicting information. At the same time, data sources may be
autonomous and it may be impossible to modify their contents in order to remove
the conflicts. Integrity constraints may also fail to be enforced because of effi- 
ciency considerations. Finally, in the case of long running operations, integrity 
violations may be only temporary and will be eliminated by further operations. 

Typically, the user formulates a query with the assumption that the database 
is consistent (i.e. satisfies the integrity constraints). A simple evaluation of the 
query over an inconsistent database may return incorrect answers. To address 
this problem, Arenas, Bertossi, and Chomicki [1] proposed the framework of
consistent query answers. They introduced the notion of a repair, a consistent
database that is minimally different from the original one. A consistent answer 
to a query is an answer true in every repair. The framework of [1] is used as a 
foundation for most of the work in the area of querying inconsistent databases 
[2, 3, 7, 5, 11, 15, 14, 4]. 

Example 1.1. Consider a database consisting of two tables Emp and Mgr
whose instance $I_0$ can be found in Table 1.

Emp                  Mgr
Name   Dept          Dept  Name  T
Alice  A             A     Mary  2
Alice  B             B     Bob   1
                     B     Mary  3

Table 1: Instance $I_0$

Assume that we have two functional dependencies $Emp : Name \to Dept$ and
$Mgr : Dept \to Name$. This database contains two conflicts: 1) in relation Emp
between the tuples (Alice, A) and (Alice, B); 2) in relation Mgr between the
tuples (B, Mary, 3) and (B, Bob, 1). (Note that one person can be the manager
of more than one department.) Each of those conflicts can be resolved in two
different ways by assuming that one tuple is correct and removing the other.
This leads to four different repairs:

$I_1 = \{Emp(Alice, A), Mgr(A, Mary, 2), Mgr(B, Bob, 1)\}$,
$I_2 = \{Emp(Alice, B), Mgr(A, Mary, 2), Mgr(B, Bob, 1)\}$,
$I_3 = \{Emp(Alice, A), Mgr(A, Mary, 2), Mgr(B, Mary, 3)\}$,
$I_4 = \{Emp(Alice, B), Mgr(A, Mary, 2), Mgr(B, Mary, 3)\}$.

For example, the repair $I_1$ is obtained by assuming that Alice works in
department A and the manager of department B is Bob. Since in every repair Mary
is the manager of department A, we can infer that true is the consistent
answer to the query

$\varphi_1 = Mgr(A, Mary)$.

However, it is not certain that Alice works in a department managed by Mary,
i.e. true is not the consistent answer to the following query

$\varphi_2 = \exists x.\ Emp(Alice, x) \land Mgr(x, Mary)$.

This is because of the repair $I_2$, where $\varphi_2$ is false.

As shown in the previous example, each conflict can be resolved in two
different ways. The framework of [1] does not provide any means to favor one
way over another. However, in many cases additional information that can be
used to resolve some of the conflicts is available. For example:

• In e-commerce applications, data are accompanied by the timestamp of
creation/last modification; the conflicts can be resolved by removing
from consideration old, outdated tuples.

• In data integration scenarios, it is often possible to provide a (partial)
order on the sources, capturing the reliability of contributed information;
the most reliable data can be used to resolve conflicts.

• Statistics can be used to resolve conflicts created by misspellings. 

Example 1.2 (cont. Example 1.1). Suppose that the column T of the table
Mgr contains for each tuple its creation timestamp (lower values correspond to
older tuples). We can use this information to express the preference that if some
tuples of Mgr are conflicting, the older should be removed from consideration
(but not removed from the database). Since the tuple (B, Bob, 1) is older than
(B, Mary, 3), we consider only the repairs containing the latter one: $I_3$ and
$I_4$. In such a case we can also infer that it is certain that Alice works in the
department managed by Mary, i.e. true is the preferred consistent answer to
the query $\varphi_2$.
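
Examples 1.1 and 1.2 can be replayed mechanically. The following Python sketch
is only an illustration, not part of the framework: it enumerates repairs by
brute force, and the encoding of tuples (relation name in position 0) and the
helper names `repairs`, `phi1`, `phi2` and `certain` are assumptions made for
this example.

```python
from itertools import combinations

# Instance of Table 1; each tuple carries its relation name in position 0.
INSTANCE = [
    ("Emp", "Alice", "A"),
    ("Emp", "Alice", "B"),
    ("Mgr", "A", "Mary", 2),
    ("Mgr", "B", "Bob", 1),
    ("Mgr", "B", "Mary", 3),
]

def conflicting(t, u):
    # Emp: Name -> Dept and Mgr: Dept -> Name both read
    # "position 1 determines position 2" in this encoding.
    return t[0] == u[0] and t[1] == u[1] and t[2] != u[2]

def consistent(tuples):
    return not any(conflicting(t, u) for t, u in combinations(tuples, 2))

def repairs(instance):
    # Brute force: collect all consistent subsets, keep the maximal ones.
    subsets = [frozenset(c)
               for k in range(len(instance) + 1)
               for c in combinations(instance, k)
               if consistent(c)]
    return [s for s in subsets if not any(s < other for other in subsets)]

def phi1(db):
    # Mary manages department A.
    return any(t[0] == "Mgr" and t[1] == "A" and t[2] == "Mary" for t in db)

def phi2(db):
    # Alice works in some department managed by Mary.
    depts = {t[2] for t in db if t[0] == "Emp" and t[1] == "Alice"}
    return any(t[0] == "Mgr" and t[1] in depts and t[2] == "Mary" for t in db)

def certain(query, family):
    # "true" is the consistent answer iff the query holds in every repair.
    return all(query(db) for db in family)

all_repairs = repairs(INSTANCE)
# Example 1.2: the newer tuple Mgr(B, Mary, 3) wins its conflict.
preferred = [db for db in all_repairs if ("Mgr", "B", "Mary", 3) in db]

print(len(all_repairs), certain(phi1, all_repairs), certain(phi2, all_repairs))
print(len(preferred), certain(phi2, preferred))
```

The first line printed reports four repairs, with $\varphi_1$ certain and
$\varphi_2$ not; the second reports that $\varphi_2$ becomes certain once only
the two preferred repairs are considered.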

In this paper we extend the framework of consistent query answers with an
additional input consisting of preference information $\Phi$. We use $\Phi$ to
define the set of preferred repairs $Rep^\Phi$. When we compute consistent
answers, instead of considering the set of all repairs $Rep$, we use the set of
preferred repairs. We assume that there exists a (possibly partial) operation
of extending $\Phi$ with some additional preference information and we write
$\Phi \subseteq \Psi$ when $\Psi$ is an extension of $\Phi$. We consider $\Phi$
to be maximal when it cannot be extended further. The main objective of our
research is to develop a framework of preferred repairs that fulfills the
following postulates:

1. Non-emptiness

(P1) $Rep^\Phi \neq \emptyset$.

2. Non-discrimination: if no preference information is given, then no
repair is removed from consideration

(P2) $Rep^\emptyset = Rep$.

3. Monotonicity: extending preferences can only narrow the set of preferred
repairs

(P3) $\Phi \subseteq \Psi \Rightarrow Rep^\Psi \subseteq Rep^\Phi$.

4. Categoricity: given maximal preference information we obtain exactly
one repair

(P4) $\Phi$ is maximal $\Rightarrow |Rep^\Phi| = 1$.

We note here that the postulates P2 and P3 together imply an important
property of conservativeness: preferred repairs are a subset of the standard
repairs (any $\Phi$ extends the empty preference information, so
$Rep^\Phi \subseteq Rep^\emptyset = Rep$).

Another important goal of our research is to determine the computational 
implications of introducing preferences. For this purpose we study here two 
fundamental decision problems in inconsistent databases [9] : (i) repair checking 
- finding if a given database is a preferred repair; (ii) computing consistent 
answers — finding if an answer to a query is present in every preferred repair. 

The main contributions of this paper are: 

• A general and intuitive framework for incorporating preferences into in- 
consistency handling based on the notion of priority. 

• A study of the semantic and computational properties of two instantiations
of the framework: (locally preferred) $\ell$-repairs and (globally preferred)
$g$-repairs.

2 Basic notions and definitions 

In this paper, we work with databases over a schema consisting of only one 
relation R with attributes from U. We use A,B, . . . to denote elements of 
U and X, Y, . . . to denote subsets of U . We consider two disjoint domains: 
uninterpreted names D and natural numbers N . Every attribute in U is typed. 
We assume that constants with different names are different and that symbols 
= , <, > have the natural interpretation over N . 

The instances of R, denoted by $r, r', \ldots$, can be seen as finite first-order
structures that share the domains D and N. For any tuple t from r, by t.A
we denote the value associated with the attribute A. In this paper we consider
first-order queries over the alphabet consisting of R and binary relation symbols
$=$, $\neq$, $<$, and $>$.

The limitation to only one relation is made only for the sake of clarity and 
along the lines of [10] the framework can be easily extended to handle databases 
with multiple relations. 

2.1 Inconsistency and repairs 

The class of integrity constraints we study consists of functional dependencies. 
We use $X \to Y$ to denote the following constraint:

$\forall t_1, t_2 \in R.\ \bigwedge_{A \in X} t_1.A = t_2.A \Rightarrow \bigwedge_{B \in Y} t_1.B = t_2.B.$

We use this formula to identify tuples creating conflicts. 

Definition 2.1 (Conflicting tuples). Given a set of functional dependencies F,
two tuples $t_1, t_2$ are conflicting w.r.t. F, denoted $t_1 \leftrightsquigarrow_F t_2$,
if and only if there exists a functional dependency $X \to Y \in F$ such that
$t_1.A = t_2.A$ for all $A \in X$ and $t_1.B \neq t_2.B$ for some $B \in Y$.

Definition 2.2 (Inconsistent database). A database r is inconsistent with a
set of constraints F if and only if r contains some conflicting tuples. Otherwise,
the database is consistent.

In the general framework when repairing a database we consider two op- 
erations: adding or removing a tuple. Because in the presence of functional 
dependencies adding new tuples cannot remove conflicts, we only consider re- 
pairs obtained by deleting tuples from the original instance. 

Definition 2.3 (Repair). Given a database r and a set of integrity constraints 
F, a database r' is a repair of r w.r.t. F if r' is a maximal subset of r consistent 
with F. 

We denote by $Rep_F(r)$ the set of all repairs of r w.r.t. F.

A repair can be viewed as the result of a process of cleaning the input
relation. Note that since every conflict can be resolved in two different ways
and conflicts are often independent, there may be an exponential number of
repairs. Also, the set of repairs of a consistent relation r contains only r.

2.1.1 Conflict graphs 

Definition 2.4 (Conflict graph). [3] A conflict graph $G_{r,F}$ is a graph whose
set of vertices is equal to r and in which two tuples $t_1, t_2$ are adjacent if
and only if they are conflicting (i.e. $t_1 \leftrightsquigarrow_F t_2$).

Recall that a maximal independent set of a graph G is a maximal set of 
vertices that contains no edge from G. By MIS(G) we denote the set of all 
maximal independent sets of G. The following observation explains why the 
conflict graph is considered a compact representation of all repairs. 

Fact 2.5. For any database r and any set of functional dependencies F we have
that

$Rep_F(r) = MIS(G_{r,F})$.
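
Fact 2.5 suggests a direct, if naive, way of listing repairs: build the conflict
graph and enumerate its maximal independent sets. The Python sketch below does
this by brute force, so it is exponential and meant for tiny instances only. The
encoding of a functional dependency as a pair of column-index tuples and the
sample instance are assumptions made for this illustration.

```python
from itertools import combinations

def conflicting(t, u, fds):
    """True if t and u violate some FD X -> Y (X, Y are tuples of column indexes)."""
    return any(all(t[a] == u[a] for a in X) and any(t[b] != u[b] for b in Y)
               for X, Y in fds)

def conflict_graph(r, fds):
    """Map every tuple of r to the set of tuples it conflicts with."""
    return {t: {u for u in r if conflicting(t, u, fds)} for t in r}

def maximal_independent_sets(graph):
    """Brute-force enumeration of maximal independent sets."""
    found = []
    vertices = list(graph)
    for k in range(len(vertices) + 1):
        for cand in combinations(vertices, k):
            s = set(cand)
            # Independent: no chosen vertex has a chosen neighbor.
            independent = all(graph[v].isdisjoint(s) for v in s)
            # Maximal: every vertex left out has a chosen neighbor.
            maximal = all(graph[v] & s for v in vertices if v not in s)
            if independent and maximal:
                found.append(frozenset(s))
    return found

# Hypothetical instance over R(A, B) with the single dependency A -> B.
r = [(1, 1), (1, 2), (2, 5)]
fds = [((0,), (1,))]
repairs = maximal_independent_sets(conflict_graph(r, fds))
print(repairs)
```

On this instance the two repairs keep the non-conflicting tuple (2, 5) together
with exactly one of the conflicting tuples (1, 1) and (1, 2).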

2.2 Priorities and preferred repairs

For the clarity of presentation we assume that from now on we work with a 
fixed database instance r and a fixed set of functional dependencies F. 

To represent the preference information, we use (possibly partial) orienta- 
tions of the conflict graph. It allows us to express preferences at the level of 
single conflicts. 

Definition 2.6 (Priority). A binary relation $\prec\ \subseteq r \times r$ is a
priority if:

1. $\prec$ is asymmetric, i.e.

$\forall x, y \in r.\ \neg[x \prec y \land y \prec x]$,

2. $\prec$ is defined only on conflicting tuples, i.e.

$\forall x, y \in r.\ x \prec y \Rightarrow x \leftrightsquigarrow_F y$.

If $x \prec y$ we say that the pair $\{x, y\}$ is prioritized and that y dominates
over x. A priority $\prec$ is total if every pair of conflicting tuples is
prioritized by $\prec$. A priority $\prec$ is acyclic if there does not exist
$x \in r$ such that $x \prec^* x$, where $\prec^*$ is the transitive closure of
$\prec$.

The first condition of priority demands the preference information to be
unambiguous for a single conflict. The second condition ensures that we are
given only the relevant preference information. If the second condition is not
fulfilled, then it can be easily enforced by intersecting $\prec$ with
$\leftrightsquigarrow_F$.

This form of preference information allows us to easily define the preference
extension: we orient some conflicting edges that were not oriented before.

Definition 2.7 (Priority extension). A priority $\prec'$ is an extension of a
priority $\prec$ if $\prec'$ agrees with $\prec$ where $\prec$ is defined (i.e.
$\prec'\ \supseteq\ \prec$).

Note that $\prec$ cannot be extended further only if $\prec$ is total. Also, an
extension $\prec'$ of a priority $\prec$ is itself a priority and therefore
$\prec'$ is asymmetric and defined only on pairs of conflicting tuples.
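
The conditions of Definitions 2.6 and 2.7 are easy to check mechanically. The
sketch below is one possible encoding, assumed for illustration only: a
priority is a set of pairs (x, y) read as "x is dominated by y", and the set of
conflicts is a set of two-element frozensets. The tuples "a", "b", "c" are
hypothetical and pairwise conflicting.

```python
def is_priority(prec, conflicts):
    """Check asymmetry and that only conflicting tuples are related."""
    asymmetric = all((y, x) not in prec for (x, y) in prec)
    on_conflicts = all(frozenset((x, y)) in conflicts for (x, y) in prec)
    return asymmetric and on_conflicts

def is_total(prec, conflicts):
    """Every conflict is oriented by the priority."""
    return {frozenset(pair) for pair in prec} >= conflicts

def is_acyclic(prec):
    """Depth-first search for a cycle in the directed graph of the priority."""
    succ = {}
    for x, y in prec:
        succ.setdefault(x, set()).add(y)
    state = {}  # 1 = on the current DFS path, 2 = fully explored

    def visit(v):
        if state.get(v) == 1:
            return False          # reached a vertex on the current path: cycle
        if state.get(v) == 2:
            return True
        state[v] = 1
        ok = all(visit(w) for w in succ.get(v, ()))
        state[v] = 2
        return ok

    return all(visit(v) for v in list(succ))

def extends(prec2, prec1):
    """prec2 is an extension of prec1 (Definition 2.7)."""
    return prec2 >= prec1

# Three hypothetical, pairwise conflicting tuples.
conflicts = {frozenset("ab"), frozenset("bc"), frozenset("ac")}
partial = {("a", "b")}
total_acyclic = {("a", "b"), ("b", "c"), ("a", "c")}
total_cyclic = {("a", "b"), ("b", "c"), ("c", "a")}
```

Here `partial` has both an acyclic and a cyclic total extension, which is the
situation that Section 3.1 rules out for the priorities considered later.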

Now we present two methods of using a priority to restrict the set of all
repairs of a given relation. The first one, $\ell$-repairs, uses the priority to
restrict the ways of constructing a repair (cleaning the database). The process
consists of multiple iterative steps and in each of them only a limited number of
conflicts is considered. The use of the priority has a local character because
the subset of the priority used in one step is not used in any further steps. The
second method, $g$-repairs, uses the priority in a global fashion by selecting
the most preferred repairs according to an order induced by the priority.

2.2.1 Locally preferred repairs 

Recall a general nondeterministic procedure for constructing a maximal
independent set of a graph: as long as the graph is not empty, we choose a vertex,
add it to the constructed set, and remove the vertex and all its neighbors from 
the graph. Depending on the choices of vertices we make, we can construct any 
maximal independent set of the input graph. Now, let's look at this procedure 
from the point of constructing a repair. Each choice of a vertex corresponds to 
taking a single repair action: keeping the corresponding tuple in the relation 
and removing all tuples conflicting with it. 

Since the choice of the tuple to keep is unconstrained, every conflict can be 
resolved in several different ways. We use the priority to restrict the possible 
ways of choosing the tuple that will be kept and whose conflicts will be resolved. 
The chosen tuple is among those that are not dominated at the given step of 
the repairing process. We use the winnow operator [8] to formally describe the 
set of tuples that we choose from: 

$\omega_\prec(s) = \{t \in s \mid \neg\exists t' \in s.\ t \prec t'\}$.

Algorithm 1 implements the construction of preferred repairs. An $\ell$-repair
(or a locally preferred repair) is any instance r' we can obtain with this
algorithm. We denote the set of all $\ell$-repairs of r w.r.t. F and $\prec$ by
$LRep_F^\prec(r)$.

Algorithm 1 Nondeterministic construction of an $\ell$-repair
1: $r' \leftarrow \emptyset$
2: $s \leftarrow r$
3: while $\omega_\prec(s) \neq \emptyset$ do
4:     choose any $x \in \omega_\prec(s)$
5:     $r' \leftarrow r' \cup \{x\}$
6:     $s \leftarrow s \setminus v(x)$, where $v(x) = \{x\} \cup \{y \mid x \leftrightsquigarrow_F y\}$
7: return $r'$

Note that an $\ell$-repair can be characterized by the sequence of choices made
in step 4 of Algorithm 1 (however, there can be more than one such sequence).
This observation allows us to state an alternative definition of an $\ell$-repair.
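
To make the nondeterminism of Algorithm 1 concrete, the following Python sketch
explores every admissible choice in step 4 and collects all results. It is an
exponential illustration rather than a practical procedure; the encoding of the
priority as a set of pairs (x, y) meaning "x is dominated by y", the adjacency
map `conflicts`, and the three hypothetical tuples are assumptions of this
example.

```python
def winnow(s, prec):
    """Tuples of s that are not dominated by another tuple of s."""
    return {t for t in s if not any((t, u) in prec for u in s)}

def l_repairs(r, conflicts, prec):
    """All outcomes of Algorithm 1, exploring every choice in step 4."""
    results = set()

    def run(built, s):
        options = winnow(s, prec)
        if not options:                    # the loop condition of step 3 fails
            results.add(built)
            return
        for x in options:                  # step 4: every admissible choice
            v = {x} | conflicts[x]         # x together with its neighbors
            run(built | {x}, s - v)        # steps 5 and 6

    run(frozenset(), frozenset(r))
    return results

# Hypothetical tuples: b conflicts with both a and c; a and c do not conflict.
r = ["a", "b", "c"]
conflicts = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

unrestricted = l_repairs(r, conflicts, set())
restricted = l_repairs(r, conflicts, {("a", "b"), ("c", "b")})
print(unrestricted, restricted)
```

With the empty priority every repair is produced, while the priority under
which b dominates both a and c leaves only the repair consisting of b.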

Proposition 2.8. Given a priority $\prec$, a set of tuples X is an $\ell$-repair
if and only if there exists an ordering $x_1, \ldots, x_n$ of X such that for
every $i \in \{0, \ldots, n-1\}$ the following set is non-empty

$(X \setminus \{x_1, \ldots, x_i\}) \cap \omega_\prec(r \setminus (v(x_1) \cup \ldots \cup v(x_i)))$

and $\omega_\prec(r \setminus (v(x_1) \cup \ldots \cup v(x_n))) = \emptyset$.

2.2.2 Globally preferred repairs 

The next construction uses the priority directly to compare two repairs.
Intuitively, one repair is better than another if all the differences between
them are justified by the priority. Formally, we define $g$-repairs in the
following way.

Definition 2.9 (Globally preferred repair). Given a priority $\prec$ and two
repairs $r_1, r_2 \in Rep_F(r)$, we say that $r_2$ is preferred over $r_1$, and
write $r_1 \ll r_2$, if

$\forall x \in r_1 \setminus r_2.\ \exists y \in r_2 \setminus r_1.\ x \prec y$.

A repair is a $g$-repair (or a globally preferred repair) if it is a
$\ll$-maximal repair. By $GRep_F^\prec(r)$ we denote the set of all $g$-repairs.
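
Definition 2.9 translates almost literally into code. In the sketch below,
which is an illustration under assumed encodings (repairs as frozensets, the
priority as a set of pairs (x, y) meaning "x is dominated by y"), a repair is
taken to be $\ll$-maximal when no other repair is preferred over it. The
repairs and priorities used are hypothetical.

```python
def preferred_over(r1, r2, prec):
    """r1 << r2: every tuple of r1 missing from r2 is dominated by a tuple of r2 missing from r1."""
    return all(any((x, y) in prec for y in r2 - r1) for x in r1 - r2)

def g_repairs(repairs, prec):
    """Repairs over which no other repair is preferred."""
    return [r1 for r1 in repairs
            if not any(r1 != r2 and preferred_over(r1, r2, prec)
                       for r2 in repairs)]

# Hypothetical repairs of an instance where b conflicts with both a and c.
repairs = [frozenset({"a", "c"}), frozenset({"b"})]

both_dominated = {("a", "b"), ("c", "b")}
only_a_dominated = {("a", "b")}

print(g_repairs(repairs, both_dominated))
print(g_repairs(repairs, only_a_dominated))
```

When b dominates both a and c, replacing {a, c} by {b} is fully justified and
only {b} remains. When only a is dominated, the loss of c is not justified, so
both repairs are globally preferred.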

This particular "lifting" of a preference on objects to a preference on sets 
of objects can be found in other contexts. For example, a similar definition is 
used for a preference among different models of a logic program [23], or for a 
preference among different worlds [19]. 

2.3 Consistent query answers 

In this paper, we use a generalized notion of consistent query answers. Instead
of taking the set of all repairs, as in [1], we consider families of preferred
repairs. We only study closed first-order logic queries. We can easily
generalize our approach to open queries along the lines of [1, 10]. For a given
query $\varphi$ we say that true is an answer to $\varphi$ in r, if
$r \models \varphi$ in the standard model-theoretic sense.

Definition 2.10 ($\mathcal{H}$-Consistent query answer). Given a closed query
$\varphi$ and a family of repairs $\mathcal{H} \subseteq Rep_F(r)$, true is the
$\mathcal{H}$-consistent query answer to a query $\varphi$ if for every repair
$r' \in \mathcal{H}$ we have $r' \models \varphi$.

Note that we obtain the original notion of consistent query answer [1] if we
take for $\mathcal{H}$ the whole set of repairs $Rep_F(r)$.

In this paper, we study the cases when we take for $\mathcal{H}$ either the set
of $\ell$-repairs or the set of $g$-repairs. This gives us two notions:

1. $\ell$-preferred consistent query answer if $\mathcal{H} = LRep_F^\prec(r)$,

2. $g$-preferred consistent query answer if $\mathcal{H} = GRep_F^\prec(r)$.

We write $r \models^\ell_{F,\prec} \varphi$ (resp. $r \models^g_{F,\prec} \varphi$)
to denote that true is the $\ell$-preferred (resp. $g$-preferred) consistent
answer to $\varphi$ (in r w.r.t. F and $\prec$).

3 Basic properties 
3.1 Cyclic priorities 

Before discussing specific properties of preferred repairs, we present reasons for 
removing cyclic priorities from consideration. 

Example 3.1. Assume a database schema R(A, B) and a set of functional
dependencies $F = \{A \to B, B \to A\}$. Consider the following database

$r = \{t_a = (1,1),\ t_b = (1,2),\ t_c = (2,2),\ t_d = (2,1)\}$

and a total cyclic priority
$\prec\ = \{(t_a, t_b), (t_b, t_c), (t_c, t_d), (t_d, t_a)\}$. The set of all
repairs is

$Rep_F(r) = \{r_1 = \{t_a, t_c\},\ r_2 = \{t_b, t_d\}\}$.

As we can easily find, $LRep_F^\prec(r)$ is empty. It is also easy to see that
$r_1 \ll r_2$ and $r_2 \ll r_1$, and thus $GRep_F^\prec(r) = \emptyset$. This
violates the postulates P1 and P4.

Intuitively, a cycle in the conflict graph represents a mutually dependent
group of conflicts (a solution of one conflict may restrict the ways of solving
other conflicts). Our intention is to break the cycle by choosing a
$\prec$-maximal element. If $\prec$ is cyclic, then such an element does not
exist, which makes the construction of a preferred repair impossible. We find
this kind of preference information (cyclic priority) to be incoherent and we
exclude it from our considerations.

3.2 Order properties of $\ll$

When we restrict our considerations only to acyclic priorities, the relation
$\ll$ has interesting order properties.

Proposition 3.2. If $\prec$ is an acyclic priority and the binary relation $\ll$
on $Rep_F(r)$ is defined in terms of $\prec$ as in Definition 2.9, then

1. $\ll$ is reflexive,

2. $\ll$ is antisymmetric,

3. $\ll$ is transitive, provided that $\prec$ is transitive.

Proof. Before proving the main thesis we introduce one definition and show
two of its properties.

Definition 3.3 (Alternating chain). Given two sets $A, B \subseteq r$ and a
priority $\prec$, an (A, B)-alternating $\prec$-chain is a (possibly infinite)
sequence $a_1, a_2, \ldots$ such that:

• every element with even index belongs to A

$a_{2i} \in A$

• every element with odd index belongs to B

$a_{2i+1} \in B$

• $\prec$ holds between every two consecutive elements, i.e.

$a_i \prec a_{i+1}$

We say that an (A, B)-alternating $\prec$-chain is maximal if it is not a proper
prefix of some (A, B)-alternating $\prec$-chain¹.

When $\prec$ is known from the context, instead of saying that $\{a_i\}$ is an
(A, B)-alternating $\prec$-chain we will simply say that $\{a_i\}$ is an
(A, B)-chain.

Proposition 3.4. For any acyclic priority $\prec$ and any two sets
$A, B \subseteq r$, every (A, B)-chain is finite.

Proof. Suppose there exists an infinite (A, B)-chain $\{a_i\}$. Because r is
finite, $\{a_i\}$ contains a recurring element x. Thus

$x \prec \ldots \prec x$.

This contradicts $\prec$ being acyclic. □

¹ A sequence $\{a_i\}_{i=1}^n$ is a proper prefix of a sequence $\{b_i\}_{i=1}^m$
if and only if $n < m$ and $a_i = b_i$ for every $i \in \{1, \ldots, n\}$. Note
that $\{b_i\}$ can be infinite ($m = \infty$), but an infinite sequence cannot
be a proper prefix.

Proposition 3.5. For any acyclic priority $\prec$ and any two sets
$X, Y \subseteq r$ such that $X \ll Y$ (where $\ll$ is defined in terms of
$\prec$), any maximal $(X \setminus Y, Y \setminus X)$-chain is of even length
(it ends with an element from $Y \setminus X$).

Proof. By the previous proposition, any $(X \setminus Y, Y \setminus X)$-chain
is finite. Assume now that there exists a maximal
$(X \setminus Y, Y \setminus X)$-chain of odd length (i.e. ending with an
element from $X \setminus Y$):

(1) $x_1 \prec y_1 \prec x_2 \prec y_2 \prec \ldots \prec x_k$.

Since $X \ll Y$, there exists $y_k \in Y \setminus X$ such that $x_k \prec y_k$.
Thus (1) is a proper prefix of the following $(X \setminus Y, Y \setminus X)$-chain:

$x_1 \prec y_1 \prec x_2 \prec y_2 \prec \ldots \prec x_k \prec y_k$.

This contradicts the maximality of (1). □

We also state a trivial fact.

Fact 3.6. For any acyclic priority $\prec$, any two sets $X, Y \subseteq r$ such
that $X \ll Y$, and any $x \in X \setminus Y$ there exists an
$(X \setminus Y, Y \setminus X)$-chain that starts with x.

Now, we show the order properties of $\ll$:

1. $\ll$ is reflexive.

Because universal quantification over the empty set is true, trivially
$X \ll X$ for any set $X \subseteq r$.

2. $\ll$ is antisymmetric.

Take two different sets $X, Y \subseteq r$ such that $X \ll Y$ and $Y \ll X$,
i.e.:

(2) $\forall x \in X \setminus Y.\ \exists y \in Y \setminus X.\ x \prec y$,

(3) $\forall y \in Y \setminus X.\ \exists x \in X \setminus Y.\ y \prec x$.

W.l.o.g. we can assume that $X \setminus Y \neq \emptyset$. Take any
$x_1 \in X \setminus Y$. By (2) we are able to find $y_1 \in Y \setminus X$ such
that $x_1 \prec y_1$. Now, by (3) we are able to find $x_2 \in X \setminus Y$
such that $y_1 \prec x_2$. This way we can construct an infinite
$(X \setminus Y, Y \setminus X)$-chain. This contradicts Proposition 3.4.

3. If $\prec$ is transitive, then $\ll$ is transitive.

Assume $\prec$ is transitive and take three different sets
$X, Y, Z \subseteq r$ such that $X \ll Y$ and $Y \ll Z$ (the case when two sets
are equal is trivial). Note that:

(4) $\forall x \in X \setminus Y.\ \exists y \in Y \setminus X.\ x \prec y$,

(5) $\forall y \in Y \setminus Z.\ \exists z \in Z \setminus Y.\ y \prec z$.

Now we take any $x \in X \setminus Z$ and consider two cases depending on
whether $x \in Y$ or not.

Suppose $x \in Y$. Let $x \prec \ldots \prec z$ be a maximal
$(Y \setminus Z, Z \setminus Y)$-chain where $z \in Z \setminus Y$ (the
existence of such a chain follows from Proposition 3.5 and Fact 3.6). If there
exists an element z' of this chain that belongs to $Z \setminus X$, then by
transitivity of $\prec$ we have $x \prec z'$ (which ends this path of the
proof). Suppose that none of the elements of the
$(Y \setminus Z, Z \setminus Y)$-chain belongs to $Z \setminus X$; then in
particular z belongs to $X \setminus Y$. By (4) there exists
$y \in Y \setminus X$ such that $z \prec y$. Moreover $y \in Z$, or otherwise we
get a contradiction with the maximality of the
$(Y \setminus Z, Z \setminus Y)$-chain. By transitivity of $\prec$ we get
$x \prec y$ and obviously $y \in Z \setminus X$.

Similarly we deal with the case when $x \notin Y$. Take $x \prec \ldots \prec y$
to be a maximal $(X \setminus Y, Y \setminus X)$-chain, where
$y \in Y \setminus X$. If there exists an element z' of this sequence that
belongs to $Z \setminus X$, then by transitivity of $\prec$ we have
$x \prec z'$ (which ends this path of the proof). Suppose that none of the
elements of the $(X \setminus Y, Y \setminus X)$-chain belongs to
$Z \setminus X$; then in particular y belongs to $Y \setminus Z$. By (5) there
exists $z \in Z \setminus Y$ such that $y \prec z$. Moreover $z \notin X$, or
otherwise we get a contradiction with the maximality of the
$(X \setminus Y, Y \setminus X)$-chain. Finally, by transitivity of $\prec$ we
get $x \prec z$ and obviously $z \in Z \setminus X$. This ends the proof.

□ 

The following example shows that $\ll$ may not be transitive if the underlying
priority is not transitive.

Example 3.7. Consider a database

$r = \{t_a = (1,1),\ t_b = (1,2),\ t_c = (1,3)\}$

over the schema R(A, B) with one functional dependency $F = \{A \to B\}$ and
with priority $\prec\ = \{(t_a, t_b), (t_b, t_c)\}$. There are three repairs of r:

$Rep_F(r) = \{A = \{t_a\},\ B = \{t_b\},\ C = \{t_c\}\}$

The corresponding conflict graph is presented in Figure 1. We note that
$A \ll B$ and $B \ll C$ but $A \not\ll C$.

Figure 1: Conflict graph $G_{r,F}$ with orientation $\prec$
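
The claims of Example 3.7 can be checked directly against Definition 2.9. The
short Python sketch below encodes the priority as a set of pairs (x, y) meaning
"x is dominated by y" (an assumption of this illustration) and also shows that
the comparison between A and C is restored once the priority is replaced by its
transitive closure, in line with Proposition 3.2.

```python
ta, tb, tc = (1, 1), (1, 2), (1, 3)

def preferred_over(r1, r2, prec):
    """r1 << r2 in the sense of Definition 2.9."""
    return all(any((x, y) in prec for y in r2 - r1) for x in r1 - r2)

A, B, C = frozenset({ta}), frozenset({tb}), frozenset({tc})

prec = {(ta, tb), (tb, tc)}           # the priority of Example 3.7
closure = prec | {(ta, tc)}           # its transitive closure

print(preferred_over(A, B, prec), preferred_over(B, C, prec),
      preferred_over(A, C, prec), preferred_over(A, C, closure))
```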

3.3 Fulfillment of the postulates 

Before we prove the fulfillment of the postulates P1-P4, we state an important
property of the two instantiations of preferred repairs: constructing a repair
from the locally best tuples by the notion of $\ell$-repairs conforms with the
global notion of preference ($g$-repairs).

Theorem 3.8. If $\prec$ is an acyclic priority, then

$LRep_F^\prec(r) \subseteq GRep_F^\prec(r)$.

Proof. Induction over the size of r. Trivial for $r = \emptyset$.

Assume the hypothesis holds for any proper subset of r and there exists
$X \in LRep_F^\prec(r)$ such that $X \ll Y$ for some $Y \in Rep_F(r)$. By
Proposition 2.8, $\omega_\prec(r) \cap X$ is non-empty. Take then any
$x \in \omega_\prec(r) \cap X$. We have $x \in Y$, or otherwise we receive a
contradiction with $X \ll Y$. Note that $Y \setminus \{x\}$ is a repair of
$r \setminus v(x)$ and $X \setminus \{x\}$ is even an $\ell$-repair of
$r \setminus v(x)$. Moreover $X \setminus \{x\} \ll Y \setminus \{x\}$ in terms
of the database $r \setminus v(x)$. Thus $X \setminus \{x\}$ is not a
$g$-repair of $r \setminus v(x)$, which contradicts the inductive hypothesis. □

In the following example we observe that the reverse containment does not hold
for an arbitrary acyclic priority, i.e. constructing repairs by choosing
only the best elements locally (as in $\ell$-repairs) may miss a $g$-repair.

Example 3.9. Consider a database

$r = \{t_a = (1,1,1),\ t_b = (2,1,2),\ t_c = (3,1,3),\ t_d = (4,1,3)\}$

over the schema R(A, B, C) with a set of functional dependencies
$F = \{B \to C\}$ and an acyclic priority

$\prec\ = \{(t_c, t_a), (t_d, t_b)\}$.

The set of repairs is
$Rep_F(r) = \{r_1 = \{t_a\},\ r_2 = \{t_b\},\ r_3 = \{t_c, t_d\}\}$. As we can
easily find, $GRep_F^\prec(r) = Rep_F(r)$. Because each of $t_c$ and $t_d$ is
dominated, the $g$-repair $r_3$ is not an $\ell$-repair, and thus
$LRep_F^\prec(r) = \{r_1, r_2\}$.
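
Example 3.9 is small enough to verify exhaustively. The Python sketch below
recomputes both families of preferred repairs for this instance by brute force.
It is an illustration only: the function names and the encoding of the priority
as a set of pairs (x, y) meaning "x is dominated by y" are assumptions of the
example.

```python
from itertools import combinations

ta, tb, tc, td = (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 3)
r = [ta, tb, tc, td]
prec = {(tc, ta), (td, tb)}

def conflicting(t, u):
    return t[1] == u[1] and t[2] != u[2]          # the dependency B -> C

conflicts = {t: {u for u in r if conflicting(t, u)} for t in r}

def repairs():
    """Maximal consistent subsets of r, by brute force."""
    subsets = [frozenset(c)
               for k in range(len(r) + 1)
               for c in combinations(r, k)
               if not any(conflicting(t, u) for t, u in combinations(c, 2))]
    return [s for s in subsets if not any(s < o for o in subsets)]

def preferred_over(r1, r2):
    """r1 << r2 in the sense of Definition 2.9."""
    return all(any((x, y) in prec for y in r2 - r1) for x in r1 - r2)

def g_repairs():
    reps = repairs()
    return {r1 for r1 in reps
            if not any(r1 != r2 and preferred_over(r1, r2) for r2 in reps)}

def l_repairs():
    """All outcomes of Algorithm 1 over every choice in step 4."""
    results = set()

    def run(built, s):
        options = {t for t in s if not any((t, u) in prec for u in s)}
        if not options:
            results.add(built)
            return
        for x in options:
            run(built | {x}, s - ({x} | conflicts[x]))

    run(frozenset(), frozenset(r))
    return results

print(sorted(map(sorted, g_repairs())))
print(sorted(map(sorted, l_repairs())))
```

The first line lists all three repairs, the second only the two singleton
repairs, matching the strict containment claimed in the example.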

Later on we present sufficient conditions under which both instantiations of
preferred repairs are equivalent (Theorem 3.12).

We recall that extending a priority consists of prioritizing conflicts not
prioritized before, and that a priority that cannot be extended further (i.e. is
maximal) is a total priority. Both classes of preferred repairs that we consider
satisfy the postulates P1-P4:

Theorem 3.10 (P1-P4 for LRep). For every relation instance r, set of
functional dependencies F, and acyclic priority $\prec$, $LRep_F^\prec(r)$
satisfies P1-P4.

Proof. We receive P1 from the fact that if $\prec$ is acyclic then
$\omega_\prec(X)$ is non-empty if and only if X is non-empty.

P2 is implied by the fact that $\omega_\emptyset$ is the identity function, which
makes LRep a generic procedure for constructing all maximal independent sets of
$G_{r,F}$.

To prove P3 assume that $\prec', \prec$ are acyclic priorities such that
$\prec'\ \subseteq\ \prec$. Take then any $X \in LRep_F^\prec(r)$ and let $a$ be
any ordering of X from Proposition 2.8. Note that since for any set A we have
$\omega_\prec(A) \subseteq \omega_{\prec'}(A)$, the ordering $a$ also fulfills
the conditions of Proposition 2.8 in terms of $\prec'$.

P4 is a consequence of P1 for LRep, Theorem 3.8, and P4 for GRep. □

Theorem 3.11 (P1-P4 for GRep). For every relation instance r, set of
functional dependencies F, and acyclic priority $\prec$, $GRep_F^\prec(r)$
satisfies P1-P4.

Proof. We get P1 from the definition.

With an empty priority we cannot justify $X \ll Y$ for any two different
repairs X and Y, which implies P2.

To show P3 assume that $\prec', \prec$ are acyclic priorities such that
$\prec'\ \subseteq\ \prec$, $X \in GRep_F^\prec(r)$, and suppose there exists
$Y \in GRep_F^{\prec'}(r)$ such that Y is preferred over X in terms of
$\prec'$. But since $\prec'\ \subseteq\ \prec$, this implies that Y is also
preferred over X in terms of $\prec$. This is a contradiction.

In order to prove P4 assume there exist two different repairs X and Y in
$GRep_F^\prec(r)$. $X \not\ll Y$ implies that there exists an element
$x \in X \setminus Y$ such that for any tuple y from $Y \setminus X$ conflicting
with x we have $x \not\prec y$. Since $\prec$ is total, for any such y we have
$y \prec x$. Take all such tuples $y_1, \ldots, y_n$ and by Y' denote any repair
that contains the following elements

$(Y \setminus \{y_1, \ldots, y_n\}) \cup \{x\}$.

Such a repair exists because this set contains no conflicting tuples. Obviously
$Y' \neq Y$ and at the same time $Y \ll Y'$. This contradicts
$Y \in GRep_F^\prec(r)$. □

3.4 Equivalence of LRep and GRep 

As we showed in Example 3.9, LRep does not have to be equal to GRep. It
suffices, however, to remove from consideration priorities with cyclic extensions
to obtain the equivalence of the two notions of preferred repair:

Theorem 3.12. If $\prec$ is a priority having only acyclic extensions, then

$GRep_F^\prec(r) = LRep_F^\prec(r)$.

Proof. We need to show $GRep_F^\prec(r) \subseteq LRep_F^\prec(r)$. Take any
$X \in GRep_F^\prec(r)$ and construct $\prec'$, a total extension of $\prec$, by
prioritizing conflicts (un-prioritized by $\prec$) in favor of X, i.e. $\prec'$
is any total priority such that for any $x \in X$ and any y, if
$x \leftrightsquigarrow_F y$ and $x \not\prec y$ then $y \prec' x$. Since $\prec$
has only acyclic extensions, $\prec'$ is acyclic. It should be clear from the
construction that $X \in GRep_F^{\prec'}(r)$. By P1, P2, P4 and Theorem 3.8 this
implies that $X \in LRep_F^{\prec'}(r)$. This by P3 gives us that
$X \in LRep_F^\prec(r)$. □

The following example shows, however, that the requirement of no cyclic exten- 
sions is not necessary for the equality above to hold. 

Example 3.13. Consider schema R(A, B, C) together with a set of functional
dependencies $F = \{B \to C\}$. Suppose we have a database:

$r = \{t_a = (1,1,1),\ t_b = (2,1,1),\ t_c = (3,1,2),\ t_d = (4,1,2)\}$

with a priority $\prec\ = \{(t_c, t_a), (t_d, t_b)\}$. The conflict graph is
presented in Figure 2. $\prec$ has a cyclic extension
$\prec' = \prec \cup \{(t_a, t_d), (t_b, t_c)\}$. At the same time
$LRep_F^\prec(r) = GRep_F^\prec(r) = \{\{t_a, t_b\}\}$.

Figure 2: Conflict graph $G_{r,F}$ with orientation $\prec$

4 Computational properties 

We study two fundamental problems of handling inconsistencies with priorities: 
(i) repair checking - determining if a database is a preferred repair of a given 
database; (ii) consistent query answers - checking if true is an answer to a given 
query in every preferred repair. We use the notion of data complexity [24] which 
captures the complexity of a problem as a function of the number of tuples in 
the database. The database schema, the integrity constraints, and the query 
are assumed to be fixed. 

4.1 Locally preferred repairs 

Recall Algorithm 1 and note that because the consecutive choices made in
step 4 consist of mutually non-conflicting tuples, the state of the computation
is independent of the order of the choices². Given a repair r', we can
"simulate" its construction by restricting the choices in step 4 to
$r' \cap \omega_\prec(s)$. The simulation succeeds if and only if r' is an
$\ell$-repair.

Theorem 4.1. Given a fixed set of functional dependencies F, the set

$B^\ell_F = \{(r, r', \prec) \mid r' \in LRep_F^\prec(r)\}$

is in PTIME.
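
The simulation argument behind Theorem 4.1 can be sketched as follows. The
Python code is an illustration under assumed encodings (the priority as a set
of pairs (x, y) meaning "x is dominated by y", conflicts as an adjacency map),
and the function name `is_l_repair` is hypothetical. It makes at most one pass
per tuple of the candidate, so it runs in time polynomial in the size of r. The
data are those of Example 3.9.

```python
def winnow(s, prec):
    """Tuples of s that are not dominated by another tuple of s."""
    return {t for t in s if not any((t, u) in prec for u in s)}

def is_l_repair(r, candidate, conflicts, prec):
    """Simulate Algorithm 1, only ever choosing tuples of `candidate` in step 4."""
    target = set(candidate)
    s, built = set(r), set()
    while True:
        options = winnow(s, prec) & target
        if not options:
            break
        x = next(iter(options))        # any admissible choice will do
        built.add(x)
        s -= {x} | conflicts[x]
    # Success: all of candidate was chosen and Algorithm 1 would stop here.
    return built == target and not winnow(s, prec)

# Data of Example 3.9: R(A, B, C) with the dependency B -> C.
ta, tb, tc, td = (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 3)
r = [ta, tb, tc, td]
conflicts = {t: {u for u in r if t[1] == u[1] and t[2] != u[2]} for t in r}
prec = {(tc, ta), (td, tb)}

print(is_l_repair(r, {ta}, conflicts, prec),
      is_l_repair(r, {tb}, conflicts, prec),
      is_l_repair(r, {tc, td}, conflicts, prec))
```

The check accepts the two singleton repairs and rejects $\{t_c, t_d\}$, because
neither of its tuples is undominated in the initial instance.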

It is shown in [9] that computing consistent answers to conjunctive queries
is co-NP-complete, but if we consider only ground quantifier-free queries, the
problem is in PTIME. On the other hand, computing $\ell$-preferred consistent
answers turns out to be an intractable problem even if we consider very simple,
single-atom queries.

Theorem 4.2. There exists a set of four functional dependencies F and a
quantifier-free ground query $\varphi$ (consisting of one atom only) such that
the set

$D^\ell_{F,\varphi} = \{(r, \prec) \mid r \models^\ell_{F,\prec} \varphi\}$

is co-NP-complete.

² The state of computation means the repair being constructed and the possible
further choices.

Proof. It is easy to construct a nondeterministic Turing machine for
$D^\ell_{F,\varphi}$ following the informal description presented here: the
machine uses nondeterministic transitions to compute all $\ell$-preferred
repairs of r and for each one checks the answer to $\varphi$. Note that

$r \models^\ell_{F,\prec} \varphi \iff \forall r' \in LRep_F^\prec(r).\ r' \models \varphi \iff \neg\exists r' \in LRep_F^\prec(r).\ r' \models \neg\varphi$.

This allows us to state that the constructed machine decides the complement of
$D^\ell_{F,\varphi}$.

Now, consider the schema $R(A_1, B_1, \ldots, A_4, B_4)$ with the set of
functional dependencies $F = \{A_1 \to B_1, \ldots, A_4 \to B_4\}$ and a ground
query $\neg R(b)$, where the value of b can be found in Table 2.

We show here a polynomial reduction of the complement of 3SAT to
$D^\ell_{F,\neg R(b)}$, i.e. for any boolean formula $\varphi$ in 3CNF we
construct a pair $(r_\varphi, \prec_\varphi)$ of a polynomial size in the size
of $\varphi$ and such that

$(r_\varphi, \prec_\varphi) \in D^\ell_{F,\neg R(b)} \iff \varphi \notin 3SAT$.

Take then any formula $\varphi$ in 3CNF and let $n$ be the number of variables used
in $\varphi$ and $k$ the number of conjuncts of $\varphi$. For simplicity we assume that:

• the variables used have consecutive indexes $x_1, \ldots, x_n$,

• $\varphi = c_1 \wedge \ldots \wedge c_k$,

• each conjunct consists of exactly three literals, $c_j = l_{j,1} \vee l_{j,2} \vee l_{j,3}$ for
$j = 1, \ldots, k$.

We define two auxiliary functions var and sgn on literals in the following fashion:

$var(x_i) = i, \quad sgn(x_i) = 1,$

$var(\neg x_i) = i, \quad sgn(\neg x_i) = -1.$

The constructed database contains the following elements:

$r_\varphi = \{v_1, \bar{v}_1, \ldots, v_n, \bar{v}_n, d_1, \ldots, d_k, b\}$

whose exact values can be found in Table 2.

Table 2: Values of tuples in $r_\varphi$

The priority relation $\prec_\varphi$ is the unique minimal binary relation on $r_\varphi$
satisfying the following conditions:

$d_j \prec_\varphi v_{var(l_{j,i})}$, for $j \in \{1, \ldots, k\}$, $i \in \{1, 2, 3\}$ such that $sgn(l_{j,i}) = 1$,
$d_j \prec_\varphi \bar{v}_{var(l_{j,i})}$, for $j \in \{1, \ldots, k\}$, $i \in \{1, 2, 3\}$ such that $sgn(l_{j,i}) = -1$,
$b \prec_\varphi d_j$, for $j \in \{1, \ldots, k\}$.

Note that this priority relation is acyclic. Also note that the construction of
$(r_\varphi, \prec_\varphi)$ can be implemented in time polynomial in the size of the input
formula $\varphi$. Figure 3 shows the conflict graph of an instance obtained from the
reduction of the formula
$\varphi = (\neg x_1 \vee x_2 \vee x_3) \wedge (x_3 \vee \neg x_4 \vee x_5) \wedge (\neg x_5 \vee \neg x_6 \vee x_7)$.

Figure 3: Conflict graph for $\varphi = (\neg x_1 \vee x_2 \vee x_3) \wedge (x_3 \vee \neg x_4 \vee x_5) \wedge (\neg x_5 \vee \neg x_6 \vee x_7)$
and orientation $\prec_\varphi$.
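The priority relation of this reduction can be generated mechanically from a 3CNF formula. The following is a minimal sketch, assuming a DIMACS-like encoding that is not part of the paper: a literal is a non-zero integer (`3` for $x_3$, `-4` for $\neg x_4$), tuples are referred to by hypothetical string names (`"v2"`, `"~v2"` for the barred tuple, `"d1"`, `"b"`), and a pair `(x, y)` is read as "x is dominated by y". The attribute values from Table 2 are not modelled.

```python
def var(lit):
    """Index of the variable used by a literal."""
    return abs(lit)


def sgn(lit):
    """1 for a positive literal, -1 for a negative one."""
    return 1 if lit > 0 else -1


def reduction_priority(clauses):
    """Pairs (x, y), read as "x is dominated by y", for a list of 3-literal clauses."""
    priority = set()
    for j, clause in enumerate(clauses, start=1):
        d = f"d{j}"
        for lit in clause:
            # d_j is dominated by the tuple that makes the literal true
            name = f"v{var(lit)}" if sgn(lit) == 1 else f"~v{var(lit)}"
            priority.add((d, name))
        priority.add(("b", d))    # b is dominated by every d_j
    return priority


# The formula used in Figure 3.
phi = [(-1, 2, 3), (3, -4, 5), (-5, -6, 7)]
pairs = reduction_priority(phi)
```

The relation contains three pairs per clause plus one pair for `b`, so its size is linear in the size of the formula.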

Now, we show that

$\exists r' \in \mathrm{LRep}^{\prec_\varphi}_{F}(r_\varphi).\ b \in r' \iff \varphi \in \mathrm{3SAT}.$

(⇒) First note that since $b \in r'$, none of the tuples $d_1, \ldots, d_k$ belongs to $r'$.
Therefore for every $i \in \{1, \ldots, n\}$ either $v_i$ or $\bar{v}_i$ belongs to $r'$. Thus the
following is a proper definition of a boolean valuation:

$V(x_i) = \begin{cases} \mathit{true} & \text{if } v_i \in r' \\ \mathit{false} & \text{if } \bar{v}_i \in r' \end{cases}$

Next, we show that $\varphi$ is true for $V$. Suppose otherwise, i.e. there exists
a conjunct $c_m$ that is not true for $V$. W.l.o.g. we can assume that
$c_m = x_1 \vee \neg x_2 \vee x_3$. This implies that $\{v_1, \bar{v}_2, v_3\} \cap r' = \emptyset$ and thus
$\bar{v}_1, v_2, \bar{v}_3 \in r'$.

Take $t_1, \ldots, t_n$ to be the ordering of $r'$ from Proposition 2.8. Since no $d_j$
tuples are present in $r'$, and the tuple $b$ is dominated by every $d_j$ tuple
(which in turn is dominated by some $v_i$ and $\bar{v}_i$ tuples), we have $t_n = b$. Let $s$
be the last index of this sequence such that $t_s$ is equal to either $\bar{v}_1$, $v_2$, or $\bar{v}_3$.
Since $d_m$ is dominated only by $v_1$, $\bar{v}_2$, and $v_3$, we have for any $p > s$

$d_m \in \omega_{\prec_\varphi}(r_\varphi \setminus (v(t_1) \cup \ldots \cup v(t_p))).$

This implies that $\omega_{\prec_\varphi}(r_\varphi \setminus (v(t_1) \cup \ldots \cup v(t_n))) \neq \emptyset$, which gives a
contradiction.

(⇐) Take any valuation $V$ for which $\varphi$ is true and construct the following set

$r' = \{b\} \cup \{v_i \mid V(x_i)\} \cup \{\bar{v}_i \mid \neg V(x_i)\}.$

First, note that $r'$ is a repair: it contains no conflicting tuples and for
every tuple from $r_\varphi \setminus r'$ there exists a conflicting tuple in $r'$.

Next, we show that $r' \in \mathrm{LRep}^{\prec_\varphi}_{F}(r_\varphi)$. In order to prove that we note that
for any subset $X \subseteq r' \setminus \{b\}$ we have

(6) $d_j \notin \omega_{\prec_\varphi}\bigl(r_\varphi \setminus \bigcup_{x \in X} v(x)\bigr)$, for $j = 1, \ldots, k$.

Suppose otherwise, i.e. there exists a set $X \subseteq r' \setminus \{b\}$ and $m$ such that

$d_m \in \omega_{\prec_\varphi}\bigl(r_\varphi \setminus \bigcup_{x \in X} v(x)\bigr).$

W.l.o.g. we can assume that $c_m = x_1 \vee \neg x_2 \vee x_3$. From the construction
of $r_\varphi$ and $\prec_\varphi$ this implies that $\bar{v}_1, v_2, \bar{v}_3 \in r'$, which is equivalent to
$V(x_1) = \mathit{false}$, $V(x_2) = \mathit{true}$, and $V(x_3) = \mathit{false}$. This implies that $c_m$
is not true for $V$, which contradicts $\varphi$ being satisfied by $V$.

Property (6) allows us to use Proposition 2.8 (take any ordering of $r'$
with $b$ in the last position) to state that $r'$ is an l-preferred repair w.r.t. $F$
and $\prec_\varphi$.

It should be noted here that adding just one tuple $b' = (0,0,0,1,0,1,0,1)$
and extending the priority with $b' \prec_\varphi b$ constructs a reduction of 3SAT to
the complement of $D^{l}_{F,R(b')}$. Therefore computing l-preferred consistent
answers is intractable also for a query consisting only of one positive literal. □

4.2 Globally preferred repairs 

Unlike l-repairs, the notion of g-repairs, because of its global character, cannot
be captured without an essential use of nondeterminism.

Theorem 4.3. There exists a set of five functional dependencies F such that
the set

$B^{g}_{F} = \{(r, r', \prec) \mid r' \in \mathrm{GRep}^{\prec}_{F}(r)\}$

is co-NP-complete.

Proof. It is easy to construct a nondeterministic Turing machine for $B^{g}_{F}$. The
machine first checks if $r'$ is a repair; if so, the machine nondeterministically
computes every repair and checks if any of them (different from $r'$) is preferred over
$r'$ w.r.t. $\prec$. This machine decides the complement of $B^{g}_{F}$.

Now, we show that the problem is co-NP-hard by reducing the complement
of 3SAT to $B^{g}_{F}$. Consider the database schema $R(A_1, B_1, \ldots, A_5, B_5)$ with
the following set of integrity constraints $F = \{A_1 \to B_1, \ldots, A_5 \to B_5\}$. For
any boolean formula $\varphi$ in 3CNF we construct a triple $(r_\varphi, X_\varphi, \prec_\varphi)$ of size
polynomial in the size of $\varphi$ and such that

$(r_\varphi, X_\varphi, \prec_\varphi) \in B^{g}_{F} \iff \varphi \notin \mathrm{3SAT}.$

Moreover, the reduction can be implemented in time polynomial in the size of
$\varphi$.

Take then any formula $\varphi$ in 3CNF and let $n$ be the number of variables
used in $\varphi$ and $k$ the number of conjuncts of $\varphi$. For simplicity we assume that:

• the variables used have consecutive indexes $x_1, \ldots, x_n$,

• $\varphi = c_1 \wedge \ldots \wedge c_k$,

• each conjunct consists of exactly three literals, $c_j = l_{j,1} \vee l_{j,2} \vee l_{j,3}$ for
$j = 1, \ldots, k$.

We define two auxiliary functions var and sgn on literals as follows:

$var(x_i) = i, \quad sgn(x_i) = 1,$

$var(\neg x_i) = i, \quad sgn(\neg x_i) = -1.$

The constructed database contains the following elements

$r_\varphi = \{v_1, \bar{v}_1, \ldots, v_n, \bar{v}_n, w_1, \ldots, w_n, d_1, \ldots, d_k, s, t\},$

whose exact values can be found in Table 3.

Table 3: Values of tuples in $r_\varphi$

The set $X_\varphi$ consists of the following elements

$X_\varphi = \{w_1, \ldots, w_n, d_1, \ldots, d_k, s\}.$

It is easy to see that $X_\varphi$ is a repair of $r_\varphi$ w.r.t. $F$. Clearly $X_\varphi \subseteq r_\varphi$, no two
elements of $X_\varphi$ are conflicting, and for every element from the set $r_\varphi \setminus X_\varphi$ there
exists a conflicting element from $X_\varphi$ ($s$ for $t$ and $w_i$ for $v_i$ or $\bar{v}_i$).

The priority relation $\prec_\varphi$ is the unique minimal binary relation on $r_\varphi$ satisfying
the following conditions:

$w_i \prec_\varphi v_i$, for $i \in \{1, \ldots, n\}$,
$w_i \prec_\varphi \bar{v}_i$, for $i \in \{1, \ldots, n\}$,
$d_j \prec_\varphi v_i$, if $c_j$ uses a positive literal $x_i$,
$d_j \prec_\varphi \bar{v}_i$, if $c_j$ uses a negative literal $\neg x_i$,
$s \prec_\varphi t$.

Note that this priority relation is acyclic. Also note that the triple $(r_\varphi, X_\varphi, \prec_\varphi)$
can be constructed in time polynomial in the size of the formula $\varphi$.
Figure 4 shows the conflict graph of the instance obtained from the reduction of
the formula $\varphi = (x_1 \vee \neg x_2 \vee x_3) \wedge (\neg x_2 \vee \neg x_3 \vee x_4)$.

Figure 4: Conflict graph for $\varphi = (x_1 \vee \neg x_2 \vee x_3) \wedge (\neg x_2 \vee \neg x_3 \vee x_4)$ and orientation $\prec_\varphi$.

Now, we show that for any $\varphi$ using variables $x_1, \ldots, x_n$ the following holds

$X_\varphi \notin \mathrm{GRep}^{\prec_\varphi}_{F}(r_\varphi) \iff \varphi \in \mathrm{3SAT}.$

(⇐) Suppose $\varphi \in \mathrm{3SAT}$ and take $V : \{x_1, \ldots, x_n\} \to \{\mathit{true}, \mathit{false}\}$ to be the
valuation for which $\varphi$ is true. Consider the following set

$Y_\varphi = \{t\} \cup \{v_i \mid V(x_i)\} \cup \{\bar{v}_i \mid \neg V(x_i)\}$

It is easy to see that $Y_\varphi$ is a repair and moreover $X_\varphi \ll Y_\varphi$. Thus $X_\varphi$ is
not a maximally $\ll$-preferred repair.

(⇒) Suppose $X_\varphi \notin \mathrm{GRep}^{\prec_\varphi}_{F}(r_\varphi)$, i.e. there exists $Y \in \mathrm{Rep}_F(r_\varphi)$ such that
$X_\varphi \ll Y$ and $Y \neq X_\varphi$.

First note that $t \in Y$. Otherwise, for $Y$ to be preferred over $X_\varphi$, the tuple $s$
has to be contained in $Y$ because there is no element dominating $s$ except
for $t$. Since $s$ is adjacent to every $v_i$ and $\bar{v}_i$, none of $v_i$ and $\bar{v}_i$
belongs to $Y$. This implies that $Y = X_\varphi$, which is a contradiction.

Since $t$ is adjacent to every element of $X_\varphi$ and $t \in Y$, the sets $Y$ and $X_\varphi$
are disjoint. This implies that for every $i$ the set $Y$ contains either $v_i$ or
$\bar{v}_i$ (from maximality, independence, and the fact that $X_\varphi \ll Y$).

Take now the following boolean valuation

$V_Y(x_i) = \begin{cases} \mathit{true} & \text{if } v_i \in Y \\ \mathit{false} & \text{if } \bar{v}_i \in Y \end{cases}$

We show that $V_Y$ is a valuation for which $\varphi$ is true. Suppose otherwise,
that there exists a conjunct $c_m$ that is not true under $V_Y$. W.l.o.g. we can
assume that $c_m = x_1 \vee \neg x_2 \vee x_3$. This implies that $\{v_1, \bar{v}_2, v_3\} \cap Y = \emptyset$.
From the construction of $\prec_\varphi$ we know that there are no elements
dominating $d_m$ except for $v_1, \bar{v}_2, v_3$. And since obviously $d_m \in X_\varphi \setminus Y$,
we obtain $X_\varphi \not\ll Y$, which is a contradiction.
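The global comparison used throughout this proof (every tuple of $X \setminus Y$ must be dominated by some tuple of $Y \setminus X$) and the resulting g-repair test can be sketched as follows. This is a brute-force illustration under assumptions that are not part of the paper: tuples are hashable names, `conflicts` is a set of two-element frozensets, `priority` is a set of pairs `(x, y)` read as "x is dominated by y", and all function names are hypothetical. The enumeration of repairs is exponential, which is in line with the co-NP-hardness shown above.

```python
from itertools import combinations


def is_independent(subset, conflicts):
    """True if no two tuples of `subset` are in conflict."""
    return not any(frozenset(p) in conflicts for p in combinations(subset, 2))


def repairs(r, conflicts):
    """All maximal conflict-free subsets of r (exponential enumeration)."""
    items = sorted(r)
    independent = [frozenset(c)
                   for k in range(len(items) + 1)
                   for c in combinations(items, k)
                   if is_independent(c, conflicts)]
    # a conflict-free set is a repair if no conflict-free proper superset exists
    return [s for s in independent if not any(s < other for other in independent)]


def globally_preferred(x, y, priority):
    """True if every tuple of x minus y is dominated by some tuple of y minus x."""
    return all(any((a, b) in priority for b in y - x) for a in x - y)


def is_g_repair(r, candidate, conflicts, priority):
    """Check that `candidate` is a repair and no other repair is preferred to it."""
    candidate = frozenset(candidate)
    all_repairs = repairs(r, conflicts)
    if candidate not in all_repairs:
        return False
    return not any(other != candidate
                   and globally_preferred(candidate, other, priority)
                   for other in all_repairs)
```

A nondeterministic machine would guess the witness `other` instead of looping over all repairs; the loop above simply makes that guess explicit.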



□

Using the notion of g-repairs also leads to a significant increase of computational
complexity when computing g-preferred consistent query answers.

Theorem 4.4. There exists a set of four functional dependencies F and a
quantifier-free ground query $\varphi$ (consisting of one atom only) such that the set

$D^{g}_{F,\varphi} = \{(r, \prec) \mid r \models^{g}_{F,\prec} \varphi\}$

is $\Pi^p_2$-complete.

Proof. The membership of $D^{g}_{F,\varphi}$ in $\Pi^p_2$ follows from the definition of g-preferred
consistent query answer: a query is not g-consistently true if it is false in some
g-repair, and checking if a given set is a g-repair is in co-NP. We show $\Pi^p_2$-hardness
below.

Consider a quantified boolean formula $\psi$ of the form

(7) $\psi = \forall x_1, \ldots, x_n \exists y_1, \ldots, y_m.\ \phi,$

where $\phi$ is quantifier-free and is in 3CNF, i.e. $\phi$ equals $c_1 \wedge \ldots \wedge c_s$, and
$c_k$ are clauses of three literals $l_{k,1} \vee l_{k,2} \vee l_{k,3}$. We will construct a database
instance $r_\psi$ (over the schema $R(A_1, B_1, \ldots)$) and a priority relation $\prec_\psi$ such
that true is a g-preferred consistent answer to the query $R(Y)$ if and only if $\psi$ is
true (the value of $Y$ can be found in Table 4). The set of integrity constraints
is $F = \{A_1 \to B_1, \ldots, A_4 \to B_4\}$.

We define two auxiliary functions var and sgn on literals in the following
fashion:

$var(x_i) = var(\neg x_i) = i, \quad sgn(x_i) = sgn(y_j) = 1,$

$var(y_j) = var(\neg y_j) = n + j, \quad sgn(\neg x_i) = sgn(\neg y_j) = -1.$

Now, we describe the tuples contained in $r_\psi$:

$r_\psi = \{p_1, \bar{p}_1, \ldots, p_n, \bar{p}_n, q_1, \bar{q}_1, \ldots, q_m, \bar{q}_m, X, Y, d_1, \ldots, d_s\}.$

The exact values of tuples can be found in Table 4.

Table 4: Values of tuples in $r_\psi$

The priority relation $\prec_\psi$ is the unique minimal priority relation that satisfies
the following conditions:

$d_k \prec_\psi p_i$, if $c_k$ uses a positive literal $x_i$,
$d_k \prec_\psi \bar{p}_i$, if $c_k$ uses a negative literal $\neg x_i$,
$d_k \prec_\psi q_j$, if $c_k$ uses a positive literal $y_j$,
$d_k \prec_\psi \bar{q}_j$, if $c_k$ uses a negative literal $\neg y_j$,
$p_i \prec_\psi Y$, for all $i \in \{1, \ldots, n\}$,
$\bar{p}_i \prec_\psi Y$, for all $i \in \{1, \ldots, n\}$.
Figure 5 shows the conflict graph of an instance obtained from the
reduction of the formula

$\forall x_1, x_2, x_3 \exists y_1, y_2.\ (\neg x_1 \vee y_1 \vee x_2) \wedge (\neg x_2 \vee \neg y_2 \vee \neg x_3).$

Figure 5: Conflict graph for $\forall x_1, x_2, x_3 \exists y_1, y_2.\ (\neg x_1 \vee y_1 \vee x_2) \wedge (\neg x_2 \vee \neg y_2 \vee \neg x_3)$
and orientation $\prec_\psi$. The conflicts generated by $A_1 \to B_1$ are marked with
dotted lines.

We partition the set of all repairs of $r_\psi$ into two (disjoint) classes:

1. $\mathcal{Y}$-repairs: repairs that contain $Y$.

2. $\mathcal{X}$-repairs: repairs that don't contain $Y$.

We will use $\mathcal{X}$- and $\mathcal{Y}$-repairs to 'simulate' all possible valuations of variables
$x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ respectively.

$\mathcal{Y}$-repairs

Because of the functional dependency $A_1 \to B_1$ a repair is a $\mathcal{Y}$-repair if and only
if it contains any of $q_j$ or $\bar{q}_j$. Moreover, for any $\mathcal{Y}$-repair $r'$ and for any $j$ either
$q_j$ or $\bar{q}_j$ belongs to $r'$. Therefore there is a one-to-one correspondence between
$\mathcal{Y}$-repairs and valuations of the $y_j$ variables. To easily move from the world of repairs
to the world of valuations and vice versa we define the following two operators
(for $r'$ being a $\mathcal{Y}$-repair and $V$ being a valuation of the variables in $\phi$):

$V_Y[r'](y_j) = \begin{cases} \mathit{true} & \text{if } q_j \in r' \\ \mathit{false} & \text{if } \bar{q}_j \in r' \end{cases}$

$r_Y[V] = \{q_j \mid V \models y_j\} \cup \{\bar{q}_j \mid V \models \neg y_j\} \cup \{Y\}.$
$\mathcal{X}$-repairs

We will further partition the class of $\mathcal{X}$-repairs depending on their 'conformance'
with $\phi$. Because $\mathcal{X}$-repairs will correspond only to valuations of $x_i$, we remove
any usage of $y_j$ from $\phi$ in the following way:

$\hat{y}_j = \widehat{\neg y_j} = \mathit{false},$
$\hat{x}_i = x_i, \quad \widehat{\neg x_i} = \neg x_i,$
$\hat{c}_k = \hat{l}_{k,1} \vee \hat{l}_{k,2} \vee \hat{l}_{k,3},$
$\hat{\phi} = \hat{c}_1 \wedge \ldots \wedge \hat{c}_s.$

For a given valuation $V$ of $x_i$ construct the following set of tuples:

$r_X[V] = \{p_i \mid V \models x_i\} \cup \{\bar{p}_i \mid V \models \neg x_i\} \cup \{d_k \mid V \not\models \hat{c}_k\} \cup \{X\}.$

It is easy to verify that $r_X[V]$ is an $\mathcal{X}$-repair. An $\mathcal{X}$-repair $r'$ is strict if and only
if there exists a valuation $V$ such that $r' = r_X[V]$. Otherwise the $\mathcal{X}$-repair is
non-strict.

It is clear that there is a one-to-one correspondence between strict $\mathcal{X}$-repairs
and valuations of $x_i$. Construction of a valuation of $x_i$ from a strict $\mathcal{X}$-repair $r'$
is also straightforward; for technical reasons we extend it to any $\mathcal{X}$-repair:

$V_X[r'](x_i) = \begin{cases} \mathit{true} & \text{if } p_i \in r' \\ \mathit{false} & \text{if } \bar{p}_i \in r' \\ \mathit{false} & \text{otherwise} \end{cases}$

Note that $\mathcal{X}$-repairs can be characterized in an alternative way:

Proposition 4.5. A repair of $r_\psi$ is an $\mathcal{X}$-repair if and only if it contains $X$.

In the main proof we use only strict $\mathcal{X}$-repairs. The following observation
will allow us to remove non-strict repairs from consideration.

Claim 4.6. Strict $\mathcal{X}$-repairs are $\ll$-maximal $\mathcal{X}$-repairs.

Proof. First we show how for any non-strict $\mathcal{X}$-repair $r'$ we construct a (strict)
$\mathcal{X}$-repair $r''$ such that $r' \ll r''$. Take the valuation $V = V_X[r']$ and let
$r'' = r_X[V]$. The repair $r''$ is strict and therefore $r' \neq r''$. We show that $r' \ll r''$, i.e.

$\forall t \in r' \setminus r''.\ \exists t' \in r'' \setminus r'.\ t \prec t'.$

There are three cases of values of $t$ to consider:

1° $X \in r' \setminus r''$. This implies that $r''$ is not an $\mathcal{X}$-repair, a contradiction.

2° For some $i$ we have $p_i \in r' \setminus r''$ or $\bar{p}_i \in r' \setminus r''$. W.l.o.g. assume that
$p_1 \in r' \setminus r''$. This implies that $V(x_1) = \mathit{true}$. From the construction of $r_X[V]$
this implies that $p_1 \in r''$, a contradiction.

3° For some $k$ we have $d_k \in r' \setminus r''$. W.l.o.g. assume that $k = 1$ and
$c_1 = x_1 \vee y_1 \vee \neg x_2$. Then $p_1 \notin r'$ and $\bar{p}_2 \notin r'$ (they are in the neighborhood of
$d_1$). From the construction of $r''$ we have that

$d_1 \notin r'' \iff V \models \hat{c}_1 \iff V \models x_1 \text{ or } V \models \neg x_2 \iff p_1 \in r'' \text{ or } \bar{p}_2 \in r''.$

And both $p_1$ and $\bar{p}_2$ dominate $d_1$.

Now, suppose that there exists a strict $\mathcal{X}$-repair $r'$ such that there exists an
$\mathcal{X}$-repair $r''$ preferred over $r'$. We show that $r' = r''$. Note that $r'$ and $r''$ must
agree on the tuples corresponding to the valuation of variables $x_1, \ldots, x_n$, i.e.

$r' \cap \{p_1, \bar{p}_1, \ldots, p_n, \bar{p}_n\} = r'' \cap \{p_1, \bar{p}_1, \ldots, p_n, \bar{p}_n\}.$

Since $r'$ is strict, its content is determined by the corresponding valuation of
variables $x_1, \ldots, x_n$. Therefore $r' = r_X[V_X[r'']]$. We showed in the previous part
of the proof that $r'' \ll r'$. Since $\ll$ is acyclic this implies that $r' = r''$. □

Claim 4.7. For any valuation $V$ of $x_i$ and $y_j$ we have $r_X[V] \ll r_Y[V]$ if and
only if $V \models \phi$.

Proof. We prove the implication in both directions:

(⇐) By contradiction. Suppose $V \models \phi$ and there exists a tuple $t$ of $r_X[V]$ which
is not dominated by any tuple from $r_Y[V]$. Obviously (from dependency
$A_1 \to B_1$) $t$ can be only one of $d_k$. W.l.o.g. assume that $k = 1$ and
$c_1 = x_1 \vee y_1 \vee \neg x_2$. By construction of $r_\psi$ this implies that $p_1 \notin r_X[V]$,
$q_1 \notin r_Y[V]$, and $\bar{p}_2 \notin r_X[V]$. From the definition of $r_X[V]$ and $r_Y[V]$ we
obtain that $V(x_1) = \mathit{false}$, $V(x_2) = \mathit{true}$, and $V(y_1) = \mathit{false}$. This gives
us $V \not\models c_1$, which is a contradiction.

(⇒) By contradiction. Suppose $r_X[V] \ll r_Y[V]$ and there exists a conjunct $c_k$
such that $V \not\models c_k$. W.l.o.g. assume that $k = 1$ and $c_1 = x_1 \vee y_1 \vee \neg x_2$.
Then $V(x_1) = \mathit{false}$, $V(x_2) = \mathit{true}$, and $V(y_1) = \mathit{false}$. Consider $d_1$ and
note that it belongs to $r_X[V]$ (by definition of $r_X$). From the construction
of $\prec_\psi$ we know that only $p_1$, $\bar{p}_2$, and $q_1$ dominate $d_1$. $r_Y[V]$ doesn't
contain any of those and this gives us a contradiction.

□

Proposition 4.8. QBF $\psi$ is true if and only if for any strict $\mathcal{X}$-repair $r'$ there
exists a $\mathcal{Y}$-repair $r''$ such that $r' \ll r''$.

By Claim 4.6 we have that only a $\mathcal{Y}$-repair can be more preferred than a strict
$\mathcal{X}$-repair and for any non-strict $\mathcal{X}$-repair there always exists a more preferred
repair.

Corollary 4.9. QBF $\psi$ is true if and only if for any $\mathcal{X}$-repair $r'$ there exists a
different repair $r''$ such that $r' \ll r''$.

From the partition of repairs we know that $\mathcal{X}$-repairs can be characterized
with the formula $\neg R(Y)$.

$\models \forall x_1, \ldots, x_n \exists y_1, \ldots, y_m.\ \phi \iff$
$\forall r' \in \mathrm{Rep}_F(r_\psi).\ [r' \models \neg R(Y) \Rightarrow \exists r'' \in \mathrm{Rep}_F(r_\psi).\ r' \neq r'' \wedge r' \ll r''] \iff$
$\forall r' \in \mathrm{Rep}_F(r_\psi).\ [\neg\exists r'' \in \mathrm{Rep}_F(r_\psi).\ r' \neq r'' \wedge r' \ll r''] \Rightarrow r' \models R(Y) \iff$
$\forall r' \in \mathrm{GRep}^{\prec_\psi}_{F}(r_\psi).\ r' \models R(Y) \iff$
$(r_\psi, \prec_\psi) \in D^{g}_{F,R(Y)}$

Corollary 4.10. QBF $\psi$ is true if and only if true is the g-preferred consistent
answer to $R(Y)$ in $r_\psi$ w.r.t. $F$ and $\prec_\psi$.

If we use the formula $R(X)$ as the characterization of $\mathcal{X}$-repairs then we can
reduce QBF to answering a query with one negated atom.

Corollary 4.11. QBF $\psi$ is true if and only if true is the g-preferred consistent
answer to $\neg R(X)$ in $r_\psi$ w.r.t. $F$ and $\prec_\psi$.

□

4.3 Database cleaning

The postulate P4 allows us to think of a total acyclic priority as a cleaning
program: an exact specification of how to resolve all conflicts. To run this
program we simply use Algorithm 1 and obtain a unique l-preferred repair.
Thanks to Theorem 3.12, this is also the unique g-repair.

Proposition 4.12. Given a total acyclic priority $\prec$, the unique l-repair
(which is also the unique g-repair) can be computed in time polynomial in the
size of the database.
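Running a priority as a cleaning program can be sketched as a short greedy loop in the spirit of Algorithm 1. The sketch below rests on assumptions that are not in the paper: tuples are hashable names, `conflicts` is a set of two-element frozensets, `priority` is a set of pairs `(x, y)` read as "x is dominated by y", and the function name `clean` is hypothetical.

```python
def clean(r, conflicts, priority):
    """Greedily keep undominated tuples and discard everything they conflict with."""
    remaining = set(r)
    repair = set()
    while remaining:
        undominated = [t for t in remaining
                       if not any((t, u) in priority for u in remaining)]
        if not undominated:
            raise ValueError("priority is cyclic")
        x = min(undominated)   # with a total priority the result does not depend on this choice
        repair.add(x)
        remaining = {u for u in remaining
                     if u != x and frozenset((x, u)) not in conflicts}
    return repair
```

Every iteration removes at least one tuple and scans the remaining pairs once, so the loop runs in time polynomial in the number of tuples.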

5 Related work 

We limit our discussion to work on using priorities to maintain consistency and 
facilitate resolution of conflicts. 

The first to notice the importance of priorities in information systems is 
[12]. The authors study there the problem of updates of databases containing 
propositional sentences. The priority is expressed by storing a natural number 
with each clause (the integrity constraints should be tagged with the highest 
priority 0). If an update (inserting or deleting a sentence) leads to inconsistency,
then among all consistent databases that realize the update, the minimally different
ones are selected. A database E is less different than a database F w.r.t. D if either
for some $i \in \{0, 1, \ldots, n\}$



where n is the lowest priority in D and D k consists of all sentences from D 
with priority less than or equal to k. Although this framework does not define a
notion of a conflict, we note that more than two facts can create a conflict
w.r.t. some constraint. For the sake of comparison, assume that the conflicts are
generated only by pairs of facts (together with one of the constraints). Then,
the selected minimally different consistent databases are equivalent to g-repairs
(and because the considered class of priorities has only acyclic extensions, they
are also equivalent to l-repairs). We note, however, that the chosen representation of
priorities imposes a significant restriction on the class of considered priorities. In
particular, it assumes transitivity of the priority on conflicting facts, i.e. if facts
a, b, and c are pairwise conflicting and a has a higher priority than b and b has
a higher priority than c, then the priority of a is higher than that of c. This assumption
cannot always be fulfilled in the context of inconsistent databases. For example,
the conflicts between a and b, and between b and c, may be caused by violation
of one integrity constraint while the conflict between a and c is introduced
by a different constraint. While the user may supply us with a rule assigning
priorities to conflicts created by the first integrity constraint, the user may not
wish to put any priorities on conflicts created by the other constraint.

A similar representation of priorities used to resolve inconsistency in first-order
theories is studied in [6], where the inconsistent set of clauses is stratified
(again the lowest stratum has the highest priority). Then preferred maximal
consistent subtheories are constructed in a manner analogous to l-repairs.
Furthermore, this approach is generalized to priorities that are partial orders, by
considering all extensions to weak orders. Again, however, this approach
assumes transitivity of priority on conflicts, which, as we explained previously,
may be considered a significant restriction.

In [21] priorities are studied to facilitate the process of belief revision. A
belief state is represented as an ordered list of propositional formulae and the
revision operation simply adds the given sentence at the end of the given belief
state. This representation of a belief state makes it possible to keep track of the
revision history, which is later used to impose a preference order on the possible
interpretations of the belief state. Only maximally preferred interpretations are used when
defining the entailment relation.

In the context of logic programs, priorities among rules can be used to handle
inconsistent logic programs (where rules imply contradictory facts). More
preferred rules are satisfied, possibly at the cost of violating less important ones.
In a manner analogous to $\ll$, [23] lifts a total order on rules to a preference on
(extended) answer sets. When computing answers only maximally preferred
answer sets are considered.

[22] investigates disjunctive logic programs with priorities on facts. The authors
use a transitive and reflexive closure (denoted here $\preceq$) of a user-supplied
set of priorities on facts. The preference on answer sets $\sqsubseteq$ is defined as follows:

• $X \sqsubseteq X$ for every answer set $X$,

• $X \sqsubseteq Y$ if

$\exists y \in Y \setminus X.\ [\exists x \in X \setminus Y.\ x \preceq y \wedge \neg\exists x' \in X \setminus Y.\ y \prec x'],$

where $x \prec y$ stands for $x \preceq y \wedge y \not\preceq x$,

• if $X \sqsubseteq Y$ and $Y \sqsubseteq Z$, then $X \sqsubseteq Z$.

The answer to a program in the extended framework consists of all maximally
preferred answer sets. The main shortcoming of this framework is its
computational infeasibility (which is specific to decision problems involving general
disjunctive programs): computing answers to ground queries to disjunctive
prioritized logic programs under cautious (brave) semantics is $\Pi^p_3$-complete (resp.
$\Sigma^p_3$-complete).

A simpler approach to the problem of inconsistent logic programs is presented
in [18]. There, conflicting facts are removed from the model unless the
priority specifies how to resolve the conflict. Because only programs without
disjunction are considered, this approach always returns exactly one model of
the input program. Constructing preferred repairs in a corresponding fashion
(by removing all conflicts unless the priority indicates a resolution) would similarly
return exactly one database instance (fulfillment of P1 and P4). However,
if the priority does not specify how to resolve every conflict, the returned
instance is not a maximal set of tuples and therefore it is not a repair. Such an
approach leads to a loss of (disjunctive) information and violates postulates P2
and P3.

[13] proposes a framework of conditioned active integrity constraints, which
allows the user to specify the way some of the conflicts can be resolved. This
notion syntactically extends the notion of embedded dependency $\forall X.[\phi \supset \exists Y.\psi]$,
where $X$ and $Y$ are sets of variables, $\phi$ and $\psi$ are two conjunctions of literals,
and each of the existential variables $Y$ is used only once. A conditioned active
integrity constraint is obtained by adding a disjunctive list of update atoms
($+C_1, \ldots, +C_k$ for adding, and $-D_{k+1}, \ldots, -D_n$ for deletion) together with
conditions $\theta_1, \ldots, \theta_n$ specifying when a corresponding update atom can be used.
Such an extended constraint is denoted as

$\forall X.[(\phi \supset \exists Y.\psi) \supset \theta_1 : +C_1 \vee \ldots \vee \theta_k : +C_k \vee \theta_{k+1} : -D_{k+1} \vee \ldots \vee \theta_n : -D_n]$

A constraint (or rather its grounded version) is said to be applied by a repair
if the original integrity constraint $(\phi \supset \exists Y.\psi)$ is satisfied in the database and
the repair is obtained by performing updates satisfying the conditional update
atom lists (one of the atoms $C_1, \ldots, C_k$ has been added and the corresponding
condition $\theta_1, \ldots, \theta_k$ is satisfied, or one of the atoms $D_{k+1}, \ldots, D_n$ has been
removed and the corresponding condition $\theta_{k+1}, \ldots, \theta_n$ is satisfied). On all repairs,
which are obtained in the standard way by taking as integrity constraints only
the heads of the conditioned active integrity constraints, we define a relation of
preference: a repair $r_1$ is preferred over $r_2$ if every (ground) constraint applied
in $r_1$ is also applied in $r_2$. We note here that when restricted to functional
dependencies the set of preferred repairs is a superset of l-repairs. Inclusion
in the other direction doesn't always hold, which is illustrated by the following
example.

Example 5.1. Consider a database $R(A_1, B_1, A_2, B_2)$ consisting of three tuples
$r = \{t_1 = (1, 1, 0, 0),\ t_2 = (1, 2, 3, 3),\ t_3 = (0, 0, 3, 4)\}$ and suppose we work in
the presence of two functional dependencies $A_1 \to B_1$ and $A_2 \to B_2$. Suppose
also that the user specifies that if two tuples are conflicting w.r.t. the FD
$A_1 \to B_1$, then the tuple with the higher value of the field $B_1$ should be preferred
when repairing the database. A similar wish is expressed for conflicts generated
by the second functional dependency. This can be expressed using the following
two conditioned active integrity constraints

$\forall x, y_1, y_2, z_1, z_2, s_1, s_2.[(R(x, y_1, z_1, s_1) \wedge R(x, y_2, z_2, s_2) \supset y_1 = y_2) \supset y_1 > y_2 : -R(x, y_2, z_2, s_2)],$

$\forall x_1, x_2, y_1, y_2, z, s_1, s_2.[(R(x_1, y_1, z, s_1) \wedge R(x_2, y_2, z, s_2) \supset s_1 = s_2) \supset s_1 > s_2 : -R(x_2, y_2, z, s_2)].$

After grounding we remove constraints with their head equal to false and we
obtain the following set

(11) $R(1,1,0,0) \wedge R(1,2,3,3) \supset 1 > 2 : -R(1,2,3,3),$
(12) $R(1,2,3,3) \wedge R(1,1,0,0) \supset 2 > 1 : -R(1,1,0,0),$
(13) $R(1,2,3,3) \wedge R(0,0,3,4) \supset 3 > 4 : -R(0,0,3,4),$
(14) $R(0,0,3,4) \wedge R(1,2,3,3) \supset 4 > 3 : -R(1,2,3,3).$

The corresponding priority relation is $\prec = \{(t_1, t_2), (t_2, t_3)\}$. Note that in the
context of the database $r$, the user has provided information sufficient to resolve
all the conflicts, i.e. among the repairs $\mathrm{Rep}_F(r) = \{r_1 = \{t_1, t_3\},\ r_2 = \{t_2\}\}$
the repair $r_1$ is the unique repair selected by $\mathrm{LRep}^{\prec}_{F}$. At the same time only
(12) is applied to $r_1$ and only (14) is applied to $r_2$, which makes both repairs
incomparable in terms of the framework of [13].
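The conflicts, the priority relation and the repairs of Example 5.1 can be recomputed with a short script. This is a sketch under assumptions of its own: functional dependencies are encoded as pairs of column indexes, a priority pair `(x, y)` is read as "x is dominated by y", and the helper name `is_repair` is hypothetical.

```python
from itertools import combinations

T1, T2, T3 = (1, 1, 0, 0), (1, 2, 3, 3), (0, 0, 3, 4)
r = [T1, T2, T3]
# Each FD is a pair of column indexes (determinant, dependent): A1 -> B1, A2 -> B2.
fds = [(0, 1), (2, 3)]

conflicts = set()   # unordered pairs of conflicting tuples
priority = set()    # pairs (x, y): x is dominated by y
for s, t in combinations(r, 2):
    for lhs, rhs in fds:
        if s[lhs] == t[lhs] and s[rhs] != t[rhs]:
            conflicts.add(frozenset((s, t)))
            # the user's rule: the tuple with the higher dependent value wins
            low, high = sorted((s, t), key=lambda u: u[rhs])
            priority.add((low, high))


def is_repair(subset):
    """A repair is conflict-free and conflicts with every tuple it leaves out."""
    consistent = not any(frozenset(p) in conflicts
                         for p in combinations(subset, 2))
    maximal = all(any(frozenset((t, u)) in conflicts for u in subset)
                  for t in r if t not in subset)
    return consistent and maximal


all_repairs = [set(c) for k in range(len(r) + 1)
               for c in combinations(r, k) if is_repair(c)]
```

On this instance the script finds the two conflicts between $t_1, t_2$ and between $t_2, t_3$, and the two repairs listed in the example.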

This example also shows that the discussed framework violates the postulate
P3. Note also that removing preference information on how to resolve the
conflict between $t_2$ and $t_3$ will yield only one repair $r_1$. This shows that this
framework violates the postulate P2. At the same time this framework fulfills
the property of conservativeness (the preferred repairs are a subset of standard
repairs) and non-emptiness (there is always at least one preferred repair). [13]
also describes how to translate conditioned active integrity constraints into a
prioritized logic program [22], whose preferred models correspond to maximally
preferred repairs. Note that the framework of prioritized logic programming is
computationally more powerful (computing answers under the brave semantics
is $\Sigma^p_3$-complete) than required by the problem of finding if an atom is present
in any repair ($\Sigma^p_2$-complete). It is yet to be seen if a less powerful programming
environment (like general disjunctive logic programs) can be used to compute
preferred answers.

[20] uses ranking functions on tuples to resolve conflicts by taking only the 
tuple with highest rank and removing others. This approach constructs a unique 
repair under the assumption that no two different tuples are of equal rank 
(postulates P1 and P4). If this assumption is not satisfied and the tuples
contain numeric values, a new value, called the fusion, can be calculated from 
the conflicting tuples (then, however, the constructed instance is not a repair in 
the sense of Definition 2.3). 

A different approach based on ranking is studied in [17]. The authors consider
polynomial functions that are used to rank repairs. When computing
preferred consistent query answers, only repairs with the highest rank are considered.
The postulates P1 and P2 are trivially satisfied, but because this form
of preference information does not have natural notions of extension and
maximality, it is hard to discuss postulates P3 and P4. Also, the preference among
repairs in this method is not based on the way in which the conflicts are resolved.

An approach where the user has a certain degree of control over the way
the conflicts are resolved is presented in [16]. Using repair constraints the user
can restrict considered repairs to those where tuples from one relation have been
removed only if similar tuples have been removed from some other relation. This
approach is monotonic, but not necessarily non-empty. The authors propose
a method of weakening the repair constraints to restore non-emptiness; however,
this comes at the price of losing monotonicity.

6 Conclusions and future work 

In this paper we proposed a general framework of preferred repairs and pre- 
ferred consistent query answers by formulating a set of intuitive postulates. We 
proposed two instantiations of the framework and studied their semantic and 
computational properties. Table 5 summarizes the computational complexity 
results; its first row is taken from [9]. 





                Repair check      Consistent answers to
                                  {∀, ∃}-free queries    conjunctive queries
All repairs     PTIME             PTIME                  co-NP-complete
l-repairs       PTIME             co-NP-complete         co-NP-complete
g-repairs       co-NP-complete    Π^p_2-complete         Π^p_2-complete

Table 5: Summary of complexity results

We envision several directions for further work. The postulates P1-P4 can
be refined, so that only non-trivial instantiations are captured. For example,
the following instantiation fulfills the postulates: we ignore any priority which
is not total and return all repairs in this case; when the priority is total we
return the unique l-repair. This approach, however, is trivial and obviously
does not increase the computational complexity of any of the considered problems.
Also, the computational consequences of further refining the postulates should 
be examined. 

Along the lines of [3] , the computational complexity results could be further 
studied, by assuming a limit on the number of functional dependencies or their 
conformance with BCNF. 

The last direction is the generalization of our framework to a broader class of
constraints. Conflict graphs can be generalized to hypergraphs [9], which make it
possible to handle a broader class of denial constraints. Then, more than two tuples
can be involved in a single conflict and the current notion of priority does not have
a clear meaning.